The present invention relates generally to computer controlled printing of documents having text and
graphical components, and particularly to methods and systems for translating documents, represented in a
structured page description language, into a number of different formats suitable for use with a variety of
printing devices and also for transmission to other devices such as facsimile transceivers.

BACKGROUND OF THE INVENTION 

Prior to the introduction of laser printers in 1980, the control commands transmitted by computers to
printers were so-called escape sequence commands, because commands were distinguished from character data
by preceding each command with a special byte called the escape character. This methodology worked well
with daisy wheel and dot matrix printers, but was not well suited for printing documents that combined text and
graphical images.

A new type of printer control methodology, using a "Page Description Language" (PDL), was developed to
control laser printers. Various PDLs were developed in the 1980s, the best known examples being PostScript
(a trademark of Adobe Systems Incorporated) and Interpress, although a number of proprietary PDLs are used
by different printers. These prior art PDLs introduced many useful printer control methodologies, including such
tools as Resource Declarations, Context Declarations, Dictionaries, the use of memory stacks, as well as a
large number of predefined commands for defining specific graphical image elements, for controlling the
contents of the printer controller's memory, and so on. These features of the prior art PDLs are extensively
documented in publicly available manuals such as Adobe Systems Incorporated's "PostScript Language Reference
Manual" and its "PostScript Language Program Design", both published by Addison-Wesley Publishing Company.
Another publication concerning PDLs is "Interpress, The Source Book" by Steven J. Harrington and Robert
R. Buckley, published by Simon & Schuster, Inc. (1988). A publication concerning a proposed standard page
description language (SPDL) which organizes documents in a hierarchical manner is "ISO/IEC DIS 10180,
Information Processing - Text Communication - Standard Page Description Language" (1991).

One shortcoming of PostScript is the fact that the Page Description for a particular document can contain
new definitions, such as a new resource definition (e.g., for an additional font to be used in the document) or
a new dictionary definition, anywhere within the document. As a result, the entire contents of the document
must be inspected in order to determine whether a particular printer has the resources necessary to print a
particular document. Alternatively, it is quite possible for the printing of a document to fail at any point during
the printing process due to the inability of the printer to comply with the commands of the document's page
description.

Another problem associated with PostScript is that in order to print a specified page of a document, it is
necessary to read the entire PDL description of all the preceding pages of the document in order to determine
the state of the document's page setup parameters (i.e., Resource Declarations, Dictionary Definitions, and
so on) at the beginning of the specified page. In other words, the print controller or a print driver program must
read the entire PDL description of the document to take into account the effect of every page setup command
between the beginning of the document and the specified page. While this page setup scanning process is
straightforward, it is wasteful. Adobe Systems Inc. has proposed programming conventions to avoid or reduce
this problem.

Interpress uses free formatted prologues which can be used to avoid the above problem. The Standard 
Page Description Language proposed by ISO on the other hand uses prologues having a predetermined fixed 
format. 

Other shortcomings in the prior art include the failure to provide systems which are designed to translate
documents between various printer command formats, various page description languages, as well as other
types of devices such as facsimile machines.

The present invention provides an improved image processor, which processes documents represented
by statements in a structured page description language (such as the Standard Page Description Language
proposed by ISO), converts documents between a variety of different document description formats, and also
transmits documents to remote devices in accordance with the resources available at those remote devices,
thereby using the most efficient data transmission format which is compatible with the receiving remote device.

SUMMARY OF THE INVENTION 

In summary, the present invention is a document processing system for controlling the printing of
documents represented in page description language form. Documents are represented by a page description
language which is structured so that definition and declaratory commands are positioned only at the beginning
of each distinct document segment. More specifically, each document has optional prologue sections, which
contain definition and declaratory commands, and content portions which contain the specific tokens or
commands for defining specific images. Furthermore, the definition and declaratory commands in the prologue
sections of the document are arranged in a hierarchical tree so that each definition and declaratory command
has a scope corresponding to the portion of the hierarchical tree subtended by that command.

The document processing system includes several distinct sets of software for processing different
portions of each document. A Structure Processor handles resource declarations and definitions, dictionary
generation, context declarations and references to data external to the document. A Content Processor processes
the tokens using the definitions and declarations set up by the Structure Processor. In addition, an Imaging
Driver Module translates the document into commands suitable for any of several types of printers, as well as
for communication by telephone line to a remote device.

An important advantage of the present invention is that any specified portion of a document can be
processed or printed without having to process the entire document prior to the specified portion. Only structural
definitions in the hierarchical tree above the specified document portion need to be processed. This is
efficient, and it also facilitates determination of the resources needed by the document prior to commencing
actual printing of the document. This feature is useful not only when printing a document, but also when
transmitting a document to a remote device. In accordance with the present invention, the document processing
system queries the remote device to determine whether it has the resources required for receiving and
processing a PDL, PostScript or HP Laserjet encoded document, and then either transmits the document in
rasterized form (e.g., to a fax machine or to other "raster output engine" devices such as a laser printer connected
to the document processing system via a video port) if the required resources are not available, or transmits
the document in a higher level encoded form if the required resources are available. By determining the
resources available to the remote device, the most efficient transmission format can be used, thereby reducing
transmission costs.
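
The choice between rasterized and higher level transmission described above can be sketched as follows. This is only an illustrative sketch in Python, not the implementation of the preferred embodiment: the function name `choose_transmission_format`, the format labels, and the resource names are all hypothetical, and the remote device's reply to the query is modeled simply as a set of resource names.

```python
def choose_transmission_format(required, available):
    """Return "high-level" if the remote device reports every required
    resource, otherwise "raster" (hypothetical labels for illustration)."""
    missing = set(required) - set(available)
    return "high-level" if not missing else "raster"


# A fax machine reports no PDL resources, so the document is rasterized.
fax_choice = choose_transmission_format({"pdl-interpreter", "font:Times"}, set())

# A remote printer that reports everything required gets the encoded form.
printer_choice = choose_transmission_format(
    {"pdl-interpreter", "font:Times"},
    {"pdl-interpreter", "font:Times", "font:Helvetica"},
)
```

The sketch assumes the required resource list is known before transmission begins, which is what the hierarchical prologue structure is intended to make possible.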

BRIEF DESCRIPTION OF THE DRAWINGS

Additional objects and features of the invention will be more readily apparent from the following detailed 
description and appended claims when taken in conjunction with the drawings, in which: 

Figures 1A and 1B are block diagrams of a document represented in structured PDL form.
Figure 2 is a block diagram of the hardware in a document processing system.
Figure 3 is a block diagram of the software modules in the preferred embodiment.
Figure 4 depicts the primary data structures used by the software modules in the preferred embodiment.
Figures 5A and 5B are flow charts of the process of parsing a document by the Lexical Analyzer and Parser
of the preferred embodiment.
Figure 6 is a flow chart of the process of interpreting structural commands by the Structure Processor of
the preferred embodiment.
Figure 7 is a block diagram depicting operation of the Content Processor of the preferred embodiment.
Figure 8 is a block diagram depicting operation of the Imaging Driver Module of the preferred embodiment.
Figure 9 is a block diagram of the communication processor software of the preferred embodiment for
transmitting documents to remote devices.

DESCRIPTION OF THE PREFERRED EMBODIMENT 

Hierarchically Structured Page Description Language 

Referring to Figures 1A and 1B, a document 100 used by the present invention is represented by a set of
page description language (PDL) elements which are divided into page sets and pictures. Both page sets and
pictures can have prologue sections that define structural elements of the document, and these prologue
sections are organized in a hierarchical fashion so that the declarations and definitions in each prologue are
applicable only to the subset of the document that is subtended by that prologue in the document's hierarchical
structure.

A top level page set 102 sets up resources, dictionary and external definitions usable by the entire
document.

For the purposes of this description, resources, dictionary and external definitions perform the same basic
functions as they do in the PostScript page description language. For those not familiar with these terms, the
following short definitions are provided. Resource declarations and definitions specify fonts, filters, fill
patterns, colors, glyphs and so on, which are then available to be invoked by tokens in the document. Resource
declarations bind a name to each specified resource, while resource definitions specify the exact nature of
each resource. Dictionaries are used to translate key values into specified lists of tokens or other values, and
thus are similar to lists of macro definitions. External definitions reference data structures external to a
document, one example being the image of a corporate logo that is to be printed at the top or bottom of letters and
memoranda.

In addition to the top level page set 102, there is a set of second level page sets 104, 106. Each page set
104 has a prologue 110, body 112 and end 114. The page set prologue 110 provides resource definitions 110-A
and declarations 110-B, dictionary generation statements 110-C and external definitions 110-D used by
subsections of the document. The body 112 of a page set consists of one or more pictures 120. A picture 120
corresponds to a contiguous segment of a document, such as a page, a portion of a page, or possibly a sequence
of several pages.

The data structure of each picture 120 comprises an optional prologue 122 followed by a picture body 124.
The picture body 124 can contain one or more sub-pictures 126, as well as a token sequence 128 which defines
the images in one segment of the document (e.g., the image elements for one page). Picture prologues 122
provide resource declarations and definitions, dictionary generation statements and context definitions used
solely by that one picture (i.e., that segment of the document).

Looking at Figures 1A-1B as a whole, the data structure embodying document 100 is structured
hierarchically. Prologues, containing definition and declaratory commands, are positioned only at the beginning of each
distinct document segment. More specifically, each document has optional prologue sections, which contain
definition and declaratory commands, and content portions which contain the specific tokens or commands
for defining specific images. Furthermore, the definition and declaratory commands in the prologue sections
of the document are arranged in a hierarchical tree so that each definition and declaratory command has a
scope corresponding to the portion of the hierarchical tree subtended by that command. Thus, for instance,
the definition and declaration commands in the prologue of Page Set 1 apply only to the pictures in that page
set and therefore do not apply to the pictures in Page Set N.
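
The scoping rule just described can be illustrated with a small sketch. It is written in Python purely for illustration; the `Node` class, the `definitions_in_scope` function, and the sample definitions are hypothetical and are not taken from the preferred embodiment. The sketch also assumes that a definition nearer to a picture overrides one of the same name higher in the tree.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """A page set or picture; `prologue` holds its definitions (hypothetical model)."""
    name: str
    prologue: Dict[str, str] = field(default_factory=dict)
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)
    children: List["Node"] = field(default_factory=list, repr=False, compare=False)

    def add(self, child: "Node") -> "Node":
        """Attach `child` below this node and return it."""
        child.parent = self
        self.children.append(child)
        return child


def definitions_in_scope(node: Node) -> Dict[str, str]:
    """Collect prologue definitions from the root down to `node`.

    Only the node's ancestors are visited; sibling subtrees are never read.
    """
    chain = []
    current: Optional[Node] = node
    while current is not None:
        chain.append(current)
        current = current.parent
    scope: Dict[str, str] = {}
    for ancestor in reversed(chain):  # root first, nearest prologue last
        scope.update(ancestor.prologue)
    return scope


top = Node("top level page set", {"font": "Times"})
set_1 = top.add(Node("Page Set 1", {"logo": "logo-1"}))
set_n = top.add(Node("Page Set N", {"fill": "gray"}))
pic_1 = set_1.add(Node("picture in Page Set 1"))
pic_n = set_n.add(Node("picture in Page Set N"))

scope_1 = definitions_in_scope(pic_1)  # sees "font" and "logo"
scope_n = definitions_in_scope(pic_n)  # sees "font" and "fill", but not "logo"
```

Because only the chain of ancestors is walked, the prologue of Page Set 1 never needs to be read when a picture in Page Set N is processed.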
A formal definition of the data structure of a document is shown in Table 1.



TABLE 1

PDL Definitions

document      ::=  (pageset | picture)

pageset       ::=  (prologue)?, (pageset_body)?

prologue      ::=  (external_decl*, informative_decl*,
                    resource_def*, resource_decl*, doc_prod_inst_decl*,
                    context_decl*, dict_gen_decl*, set_up_proc*)

pageset_body  ::=  (pageset | picture)

picture       ::=  (prologue)?, (picture_body)?

picture_body  ::=  (picture | tokensequence)*

,   All must occur in the order shown.

|   One and only one must occur.

?   Optional (0 or 1 time)

*   Optional and repeatable (0 or more times)



Image Processor Hardware 

Referring to Figure 2, the Image Processor 150 of the present invention is preferably implemented as a
stand-alone computer system having a central processing unit (CPU) 152 such as the AMD 29000 made by
Advanced Micro Devices or any of the Motorola 68000 series microprocessors, and random access memory

when processing of page set 1 ended and processing of another page set began, because the dictionary
defined in page set 1 is not applicable to the other page sets.

Table 2 represents an example of a very short document using a page description language structured in
accordance with the present invention. This document defines multiple dictionaries and resets their order
within the image processor's "dictionary stack", thereby changing the order in which the dictionaries are searched
for key values.
TABLE 2
Example of Document

<!DOCTYPE SPDL PUBLIC "XXX">
<SPDL>
<comment>
Test File Header

Copyright 1991 Ricoh Corporation
All Rights Reserved
Confidential and Proprietary

File         : ctxtdcll.pro
Author       : Tetsuro Motoyama
Version      : 0.01
File Created : June 5, 1991
First Draft  : June 5, 1991

Update History :
Description    :
    This is a test file for SPDL syntax checking. It creates
    three dictionaries through the Dict. Gen. and manipulates
    the context stack by context declaration.

Note: Put PSEUDO codes before each SPDL test file.

- SPDL PSEUDO CODE LISTING -
SPDL
  Document: picture              % three dictionaries are defined
    prologue
      Dictionary Generator dictid=alpha size 3
      Dictionary Generator dictid=beta size 4
      Dictionary Generator dictid=gamma size 5
    pictbody
      tokenseq 2 3 a             % expect the a in the gamma to be executed
      picture
        prologue
          context decl gamma beta alpha
        pictbody
          tokenseq 2 3 a         % expect the a in the alpha to be executed
</comment>
<document>
<picture spdlid="SPDL"
         cntnttyp="SPDLClearText">
<comment> spdlid and cntnttyp are Public Object ID values
</comment>
<prologue>
<dictgens>
<dictgen size="3"><dictid><name>alpha</name></dictid>
<tokenseq>
    % The operand stack has the dictionary reference
    dup /a {add} put
    dup /d {div} put
    dup /m {mul} put
</tokenseq></dictgen>
<dictgen size="4"><dictid><name>beta</name></dictid>
<tokenseq>
    dup /a {div} put
    dup /d {mul} put
    dup /m {add} put
    dup /s {sub} put
</tokenseq>
</dictgen>
<dictgen size="5"><dictid><name>gamma</name></dictid>
<tokenseq>
    dup /a {mul} put
    dup /d {add} put
    dup /m {sub} put
</tokenseq>
<tokenseq>
    dup /s {div} put
    dup /c {cos} put
</tokenseq>
</dictgen>
</dictgens>
<comment>
    dictionary stack
        gamma
        beta
        alpha
</comment>
</prologue>
<picbody>
<tokenseq>
    2 3 a    % expect the result to be 6
</tokenseq>
<picture>
<prologue>
<ctxtdecl>
<name>gamma</name><name>beta</name><name>alpha</name>
</ctxtdecl>
</prologue>
<comment>
    dictionary stack
        alpha
        beta
        gamma
</comment>
<picbody>
<tokenseq>
    2 3 a    % expect the result to be 5
</tokenseq>
</picbody>
</picture>
</picbody>
</picture> <comment> printing the first page </comment>
</document>
</SPDL>



The Structure Processor 202 allocates memory for storing dictionaries, but passes dictionary generation
commands to the Content Processor for creating the specific entries in each defined dictionary. The Structure
Processor 202 also pushes and pops dictionary pointers onto and off of the dictionary stack 224 so as to control
the scope of each dictionary. The Content Processor 204 searches the dictionaries currently in the dictionary
stack 224, starting with the last added dictionary, for "key" values in tokens. If two dictionaries in the stack
224 have conflicting definitions for a particular key, the last entered definition for a specified key is the one
that is used.

Operand stack 226 is a standard operand stack used for temporarily storing parameter values to be used 
by imaging operators (called tokens) in the content portion of the document being processed. 
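
The interaction of the dictionary stack and the operand stack can be traced with the token sequence "2 3 a" from Table 2. The following Python sketch is illustrative only: the function names `lookup` and `run` are hypothetical, each dictionary is reduced to its /a entry, and every key is assumed to name a two-operand arithmetic operator.

```python
import operator

# Dictionaries from the Table 2 example, reduced to the key /a.
alpha = {"a": operator.add}
beta = {"a": operator.truediv}
gamma = {"a": operator.mul}


def lookup(dictionary_stack, key):
    """Search from the most recently pushed dictionary downward."""
    for dictionary in reversed(dictionary_stack):
        if key in dictionary:
            return dictionary[key]
    raise KeyError(key)


def run(tokens, dictionary_stack):
    """Push numeric operands; resolve any other token as a key and apply it."""
    operand_stack = []
    for token in tokens:
        if isinstance(token, (int, float)):
            operand_stack.append(token)
        else:
            func = lookup(dictionary_stack, token)
            right = operand_stack.pop()
            left = operand_stack.pop()
            operand_stack.append(func(left, right))
    return operand_stack


# After the three dictionary generators, gamma was pushed last (top of stack).
first = run([2, 3, "a"], [alpha, beta, gamma])

# After the context declaration "gamma beta alpha", alpha is on top.
second = run([2, 3, "a"], [gamma, beta, alpha])
```

With gamma on top, /a resolves to multiplication and the result is 6; once the context declaration places alpha on top, the same tokens resolve /a to addition and the result is 5, matching the comments in Table 2.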

process the current element at step 284. More specifically, the Structure Processor contains routines for
processing every type of legal element allowed in the Page Description Language, and the appropriate one of these
routines is called at step 284.

Document Structure Processor

Referring to Figure 6, the Structure Processor 202 contains the following software modules, whose
functions are briefly described by the text in Figure 6: document structure manager 300, external description
manager 302, information declaration manager 304, resource definition manager 306, resource declaration
manager 308, document production instruction manager 310, dictionary stack manager 312, dictionary generator
manager 314, setup procedure manager 316, token sequence manager 318, and structure error handler 320.
These software modules decode corresponding elements in each prologue section of the document, and store
representations of the resulting "printing command interpretation environment" as state parameters.

Token elements are passed by the Structure Processor to the Content Processor by token sequence
manager 318.

Document production instructions, such as an instruction to print only pages 7 through 10 of a document,
are handled initially by the document production instruction manager 310 so as to store the appropriate
production control values in data structure 232 (see Figure 4). Thereafter, the document structure manager 300
uses the stored document production control values to skip over or discard sections of the document
corresponding to unselected portions of the document, and to push and pop dictionaries onto the dictionary stack
so as to provide the appropriate dictionaries for each section of the document that is selected for printing.
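
A simplified sketch of this selective processing is given below in Python. It is an assumption-laden illustration rather than the method of the preferred embodiment: the function `select_sections` is hypothetical, the document is flattened into a list of page sets, each picture is treated as exactly one page, and dictionaries are represented by plain strings.

```python
def select_sections(sections, first_page, last_page):
    """Yield (page, dictionary_stack) only for pages in the selected range.

    `sections` is a list of (prologue_dictionaries, pages) pairs, one per
    page set; this flat layout is a simplification made for the sketch.
    """
    dictionary_stack = []
    page_number = 0
    for prologue_dictionaries, pages in sections:
        dictionary_stack.extend(prologue_dictionaries)   # push on entry
        for page in pages:
            page_number += 1
            if first_page <= page_number <= last_page:
                yield page, list(dictionary_stack)
            # Unselected pages are skipped without interpreting their tokens.
        for _ in prologue_dictionaries:                  # pop on exit
            dictionary_stack.pop()


sections = [
    (["dict_A"], ["p1", "p2"]),
    (["dict_B"], ["p3", "p4"]),
]
selected = list(select_sections(sections, 2, 3))
```

Each selected page is paired with only the dictionaries of the page set that contains it, so page "p3" never sees "dict_A".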

Document Content Processor 

Referring to Figure 7, the Content Processor 204 has a token manager 350 which receives each
token element to be processed. There are three special cases in which the token element is not interpreted
by the Content Processor 204. If the document is not being printed, but instead is being transmitted to a remote
device capable of processing PDL documents, the token manager 350 passes the element unchanged to the
Imaging Driver Module. Similarly, if the document is being printed by a PDL capable printer (i.e., a printer which
can interpret PDL documents), the content tokens are passed unchanged by the token handler to the Imaging
Driver Module. Finally, if the document is being printed by a PostScript compatible printer (i.e., a printer which
can interpret PostScript language documents), the content tokens are converted into equivalent PostScript
tokens (using a simple table look-up conversion methodology) by the Token Handler 350 and then passed to the
Imaging Driver Module.
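
The three special cases and the default case can be summarized by the following Python sketch. It is illustrative only: the `Target` enumeration, the `handle_token` function, and the entries of the conversion table are hypothetical placeholders, and passing an unlisted token through unchanged in the PostScript case is an assumption of the sketch.

```python
from enum import Enum, auto


class Target(Enum):
    REMOTE_PDL_DEVICE = auto()   # transmission to a PDL-capable remote device
    PDL_PRINTER = auto()         # printer that interprets PDL directly
    POSTSCRIPT_PRINTER = auto()  # printer that interprets PostScript
    OTHER = auto()               # anything that needs interpreted image data


# Hypothetical PDL-to-PostScript token table; the entries are placeholders.
PDL_TO_POSTSCRIPT = {"pdl_moveto": "moveto", "pdl_show": "show"}


def handle_token(token, target, interpret):
    """Route one content token according to the output target."""
    if target in (Target.REMOTE_PDL_DEVICE, Target.PDL_PRINTER):
        return token                                # passed unchanged
    if target is Target.POSTSCRIPT_PRINTER:
        return PDL_TO_POSTSCRIPT.get(token, token)  # simple table look-up
    return interpret(token)                         # normal content processing


unchanged = handle_token("pdl_show", Target.PDL_PRINTER, str.upper)
converted = handle_token("pdl_show", Target.POSTSCRIPT_PRINTER, str.upper)
interpreted = handle_token("pdl_show", Target.OTHER, str.upper)
```

Here `str.upper` merely stands in for the interpretation step so that the routing can be observed; it has no connection to real imaging operators.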

Assuming that none of the exceptional cases just described applies, and that therefore the token
element received needs to be interpreted and converted into image data, the required content processing
proceeds as follows. Operands for the element are pushed onto the operand stack by an operand stack handler
352, and then the operator portion of the element is "executed" or interpreted by operator execution controller
354. While interpreting token elements, the operator execution controller 354 uses the parameters previously
pushed onto the stack and generates imaging parameter values, which represent portions of the document being
processed. The imaging parameter values are passed to the Imaging Driver Module.

A dictionary handler 356 is called by the token handler to convert parameter keys into parameter strings,
and is called by the operator execution controller 354 to convert operator keys into corresponding strings of
operator, and sometimes parameter, values. Finally, errors such as stack underflows and overflows, and
references to undefined keys or to a nonexistent dictionary, are handled by a content error handler 358.

Imaging Driver Module

Referring to Figure 8, the Imaging Driver Module 206 has an option handler that determines which of
several imaging driver programs 372-378 is to be used to process the document. Driver programs are provided
for PostScript printers, HP Laserjet and HP Laserjet emulation printers, bit map printers, and for sending
documents to remote devices. A bit map printer is one that does not accept high level commands and therefore must
be sent an image in the form of a rasterized bit map. A rasterizer routine 380 converts each page of the
document into a bit mapped image, which can then be transmitted to a device such as a simple dot matrix or ink jet
printer that does not have a built-in PDL or PostScript interpreter. The rasterizer routine 380 is also used when
printing to an HP Laserjet or compatible printer for converting all portions of the document, excepting those
text portions for which the printer has corresponding built-in or downloaded fonts, into bit map form.

The Imaging Driver Module 206 runs in parallel with the other modules of the image processor. As imaging 

5. The document image printing controller of claim 1, wherein said imaging driver processor includes a printer
driver and a rasterizer for converting said imaging instructions into an image bit map and then transmitting
said image bit map to a second printer port.

6. The document image printing controller of claim 1, wherein said imaging driver processor includes a first
printer driver for transmitting said imaging instructions to a first printer port using high level page
description language commands, and a second printer driver and a rasterizer for converting said imaging
instructions into an image bit map and then transmitting said image bit map via a communication port to a
remotely located printing device;

said document image printing controller including means for generating a required resource list
representing a set of resources which would be required for a remotely located printing device to print said
received document were said received document to be transmitted to said remotely located printing device
using high level page description language commands;

said second printer driver including protocol means for querying said remotely located printing
device to determine whether said remotely located printing device has resources corresponding to those
represented by said required resource list, means for transmitting said imaging instructions using high
level page description language commands when said remotely located printing device responds
affirmatively to said querying by said protocol means, and means for transmitting said imaging instructions
as an image bit map when said remotely located printing device does not respond affirmatively to said
querying by said protocol means.

7. The document image printing controller of claim 1, wherein

said predefined printer device is configured to receive documents in PostScript language form;

said printer imaging driver processor includes structure converting means for converting said
stored representations of said prologue elements into corresponding PostScript commands;

said content processor includes content converting means for converting said content sections of
said received document into corresponding PostScript commands; and

said printer imaging driver processor includes means for transmitting said corresponding PostScript
commands to said predefined printer device.

8. The document image printing controller of claim 1, wherein said predefined printer device is configured
to receive documents in PostScript language form;

said document image printing controller including: structure converting means for converting said
stored representations of said prologue elements into corresponding PostScript commands, and content
converting means for converting said content sections of said received document into corresponding
PostScript commands; and

said printer imaging driver processor including means for transmitting said corresponding
PostScript commands to said predefined printer device.

9. A document image printing controller, comprising: 
means for receiving a document represented by a stream of page description language elements 

which define said document as a hierarchical tree structure, said received document including a hierarch- 
ically ordered set of prologue sections containing prologue elements and content sections containing im- 
age defining token elements; wherein said prologue elements in each said prologue section are applicable 
only to those of said content sections subtended by said prologue section in the document's hierarchical 
tree structure; 

a content processor for processing said content sections of said received document and generating 
corresponding imaging instructions for a predefined printer device; 

a document structure processor which processes said prologue elements, and stores representa- 
tions thereof in a computer memory so that stored representations of only those prologue elements ap- 
plicable to each content section of said document are available to said content processor while processing 
each said content section of said document; and 

an imaging driver processor for converting said imaging instructions into an image bit map and 
transmitting said image bit map to said predefined printer device. 

10. A method of printing a document, the steps of the method comprising:

receiving a document represented by a stream of page description language elements which define
said document as a hierarchical tree structure, said received document including a hierarchically ordered